Report - Satellite Usage

Context

Since the launch of SPUTNIK 1 in 1957, thousands of satellites and other payloads have launched for applications ranging from telecommications, earth observation, navigation, military, exploration and many more. This report compiled 60 years on, explores and visualise some of this history and give an insight into the status of the satellite population today.

Complete history of launch records since 1957 are combined with a database of known Operational satellites. Both sources are merged based on the unique NORAD identifier allocated by USSPACECOM.

After launch payloads are ejected into space where they operate at different orbital altitudes depending on their mission. The most commeon orbits are referred to as LEO (20-1200km), MEO (1200-36,000km) and GEO (36,000km). GEO orbit has the same orbital period as earths rotation so that it appears stationary from the earth which makes it useful for broadcast and telecommunications applications.

GEO satellites are positioned in a belt around the equator (known as the Geostationary arc), typically separated by at least 2 degrees longitude unless they are operated as part of a stack at the same location. LEO, and MEO satellites have a orbital period less than 24 hours so the orbit is typically described only by the altitude.

At the end of their operational life, objects gradually decay in altitude until eventually they return to the earth in a ball of flames. Alternatively, higher orbit objects are pushed further out into space ideally into graveyard orbit.

For the purposes of this report, it is only useful to understand terminology used is LEO, MEO, GEO. Further reading on orbits is available at https://en.wikipedia.org/wiki/Satellite#Orbit_types

Study Aims

Objective of this phase of the study is to compile a custom subset of records for commercial telecommunications satellites in orbit. We will later use these records we can answer some fundamental questions regarding the number of GEO telecommunications satellites, which countries/operators have launched the most, and how many operational spacecraft lifespace has increased over the last 50 years. The study targets only large (>50kg), GEO Commercial Communications Satellites of which the number of operational satellite will be determined.

Datasource 1 : Satellite Catalog (NORAD) Database

URL : https://celestrak.com/pub/satcat.txt

This database is a timestamped log of objects placed into space since 1957 to date.

Useful information includes :

  • NORAD identifier,
  • Launch Date,
  • Decay Date (where available),
  • Altitude in Km (called Apogee and Perigee),
  • Payload Flag
  • Launch Site and Country of Ownership

The database contains over 43,000 entries, is regularly maintained soon after each event (within days)

Raw data is ascii space delimited text, read directly from the source url. Each row is a fixed length string of 132 characters. There are 14 tab-delimited columns. Each column contains a different character format representing one or more data descriptors in accordance with the format description https://www.celestrak.com/satcat/satcat-format.asp

Supplemental reference information is provided tables in accompanying webpages :

Collection and Storage : Running sat_main.py script reads the source data into tables in a MySql database. All tables are created and data types are formatted prior to insertion. Subsequent runs of sat_main will update Mysql entries.

Datasource 2 : Union of Concerned Scientists (UCS) Database

URL : https://www.ucsusa.org/nuclear-weapons/space-weapons/satellite-database

This database includes Operational and User applications such as :

  • Orbit Type (LEO, MEO, GEO)
  • Ownership
  • Purpose (Communications, Earth Science, Navigation etc)
  • Users (Commercial, Military, Governent, Civil etc)
  • Launch Mass and Dry Mass
  • Satellite Manufacturer

The database contains over 1740 entries updated quarterly. It has fixed column structure, with fairly complete data although the categories are numerous. Entries are listed alphabetically.

Collection and Storage : Excel format file manually saved from the url into the /data folder and read directly into the notebook using Pandas excel_read().

Notebook Visualisations

Plotly : https://plot.ly/python/

All visualisation in this notebook run offline using the Plotly library.

Import Data

In [1]:
from IPython import get_ipython
get_ipython().magic('reset -sf')           # clear variables each time script is rerun

import pandas as pd
from sqlalchemy import create_engine, Table, Column, Integer, String, MetaData
import numpy as np
import matplotlib.pyplot as plt

import plotly
import plotly.plotly as py
import plotly.graph_objs as go
plotly.offline.init_notebook_mode() # run at the start of every notebook for offline plots

Extract Data from SQL Database

In [2]:
# edit these according to your database settings
config = {
    'user'     : 'apage002',                # change
    'password' : 'thomas',                  # change
    'host'     : 'localhost',               # no change if running Mysql on igor
    'database' : 'apage002_assignment1'     # change
}

Check Connection to MySQL Database & List tables available

In [3]:
# MySQL-connector-python - engine connector must be included in each cell
engine = create_engine('mysql+mysqlconnector://' + config['user'] + ':' + config['password'] + '@' + config['host'] + '/' + config['database'])

with engine.connect() as connection:                      # create connection object
    
    sql = connection.execute('SELECT VERSION()')          # query MySQL db
    data = sql.fetchall()[0][0]                           # return query result 
    print("Connected to db version : %s\n" % data)
           
    # List available tables in db (uses SqlAlhamy)
    meta = MetaData()
    meta.reflect(bind=engine)
    
    for table in meta.tables:
        print(table)
               
connection.close()                                        # close connection
print("\nFinished closed connection")
Connected to db version : 5.5.41

NORAD
Satcat
Satcat_Owners
Intelsat
UCS
Satbeams
SES
Satcat_LaunchSites

Finished closed connection

List Number of Entries in satcat table in MySQL

In [4]:
# MySQL-connector-python - engine connector must be included in each cell
engine = create_engine('mysql+mysqlconnector://' + config['user'] + ':' + config['password'] + '@' + config['host'] + '/' + config['database'])

with engine.connect() as connection:                                        # create connection object
     
    sql = connection.execute("""SHOW TABLE STATUS WHERE name='Satcat'""")   # query MySQL db
    data = sql.fetchall()                                                   # return query result 
    print("Satcat Table Last Modified : %s" % data[0][11]) 
 
    sql = connection.execute("""SELECT COUNT(*) FROM Satcat""")             # query MySQL db
    data = sql.fetchall()                                                   # return query result 
    print("Satcat Table Entries : %s" % data[0][0]) 
           
connection.close()                                                          # close connection
print("\nFinished closed connection")
Satcat Table Last Modified : 2018-01-06 13:56:10
Satcat Table Entries : 43071

Finished closed connection

^^^^ Note the current total of full entries in the Mysql satcat database

Import SATCAT data from MySQL Database into Dataframe

Selects only Payloads from the Satcat Table using the PFlag '*' variable NOTE SATCAT timestamp are not in date order

In [5]:
# MySQL-connector-python
engine = create_engine('mysql+mysqlconnector://' + config['user'] + ':' + config['password'] + '@' + config['host'] + '/' + config['database'])

df = pd.read_sql_query("""SELECT * FROM Satcat WHERE PFlag ='*'""", engine, parse_dates=['LaunchDate','DecayDate'])

# return dataframe containing all Payloads in the database
df.shape
Out[5]:
(7912, 19)

^^^^ Note that only the entries with Payload Flag are imported

In [6]:
df.head()            # show dataframe
Out[6]:
IntDes LaunchYear LaunchSeq Piece NORAD MFlag PFlag OpCode SatName Ownership LaunchDate LaunchSite DecayDate Period Inclin Apogee Perigee Radar OpStat
0 1957-001B 1957 1 B 2 * D SPUTNIK 1 CIS 1957-10-04 TYMSC 1958-01-03 96.1 65.0 945 227 0.000
1 1957-002A 1957 2 A 3 * D SPUTNIK 2 CIS 1957-11-03 TYMSC 1958-04-14 103.7 65.3 1659 211 0.080
2 1958-001A 1958 1 A 4 * D EXPLORER 1 US 1958-02-01 AFETR 1970-03-31 88.5 33.2 215 183 0.000
3 1958-002B 1958 2 B 5 * VANGUARD 1 US 1958-03-17 AFETR NaT 132.8 34.3 3835 649 0.122
4 1958-003A 1958 3 A 6 * D EXPLORER 3 US 1958-03-26 AFETR 1958-06-28 103.6 33.5 1739 117 0.000
In [7]:
# Define Categorical Data Type
df['Ownership'] = df['Ownership'].astype('category')

# satcat variable data types (apart from categorical) were defined during table creation
df.dtypes          
Out[7]:
IntDes                object
LaunchYear             int64
LaunchSeq              int64
Piece                 object
NORAD                  int64
MFlag                 object
PFlag                 object
OpCode                object
SatName               object
Ownership           category
LaunchDate    datetime64[ns]
LaunchSite            object
DecayDate     datetime64[ns]
Period               float64
Inclin               float64
Apogee                 int64
Perigee                int64
Radar                float64
OpStat                object
dtype: object

Import Launch Site Names table from Sql and return as dataframe

In [8]:
# MySQL-connector-python
engine = create_engine('mysql+mysqlconnector://' + config['user'] + ':' + config['password'] + '@' + config['host'] + '/' + config['database'])

df_Launchsites = pd.read_sql_query('SELECT * FROM Satcat_LaunchSites', engine)

# Drop 'id' and set SiteCode as Index
df_Launchsites.drop('id', inplace=True, axis=1)
df_Launchsites.set_index('SiteCode', inplace=True)

df_Launchsites.head()
Out[8]:
SiteName
SiteCode
TANSC Tanegashima Space Center, Japan
TAISC Taiyuan Space Center, PRC
SUBL Submarine Launch Platform (mobile), Russia
WLPIS Wallops Island, Virginia, USA
SRILR Satish Dhawan Space Centre, India (Formerly S...

Import Country Names table from Sql and convert to Dictionary

In [9]:
# MySQL-connector-python
engine = create_engine('mysql+mysqlconnector://' + config['user'] + ':' + config['password'] + '@' + config['host'] + '/' + config['database'])

df_Owners = pd.read_sql_query('SELECT * FROM Satcat_Owners', engine)

# Drop 'id' and set SiteCode as Index
df_Owners.drop('id', inplace=True, axis=1)
df_Owners.set_index('OwnCode', inplace=True)

df_Owners.head()
Out[9]:
OwnName
OwnCode
PRES People's Republic of China/European Space Agency
ANG Angola
MA Morocco
SAFR South Africa
KAZ Kazakhstan

Visualisation 1 - satcat Dataframe Entries

This graph visualises the quantity of database entries. Traces are drawn showing the total number of entries imported, how many have decayed, number that remain in orbit. This graph is used as verify data has been imported.

In [10]:
# create df Filtered Decayed Payloads
DEC = df.loc[(df['OpCode'] == 'D')]
DEC.set_index('LaunchDate', inplace=True);                          # set index to launchDate and sort the 
DEC = DEC.sort_index()                                              # sort the Launch Dates into date order
DEC.reset_index(inplace=True)

# create df Filtered Filter In Orbit Payloads
ON = df.loc[(df['OpCode'] != 'D')]
ON.set_index('LaunchDate');
ON = ON.sort_index()
ON.reset_index(inplace=True)

# create df Filtered Filter Active Payloads
PAY = df.loc[(df['OpCode'] == '+')]
PAY.set_index('LaunchDate');
PAY = PAY.sort_index()
PAY.reset_index(inplace=True)

# Create trace for plot total entries
trace1 = go.Scatter(
    x = df.LaunchDate.sort_values(),
    y = df.LaunchDate.index,
    name = 'Cataloged',
    marker = dict(color='blue'),
    opacity = 0.5,
)
# Create trace for plot Decayed Payloads
trace2 = go.Scatter(
    x = DEC.DecayDate.sort_values(),
    y = DEC.DecayDate.index,
    name = 'Decayed',
    marker = dict(color='red'),
    opacity = 0.5,
)
# Create trace for plot In Orbit Payloads
trace3 = go.Scatter(
    x = ON.LaunchDate.sort_values(),
    y = ON.LaunchDate.index,
    name = 'In Orbit Inactive',
    marker = dict(color='grey'),
    opacity = 0.5,
)
# Create trace for plot Active Payloads
trace4 = go.Scatter(
    x = PAY.LaunchDate.sort_values(),
    y = PAY.LaunchDate.index,
    name = 'Active',
    marker = dict(color='green'),
    opacity = 0.5,
)
data = go.Data([trace1, trace2, trace3, trace4])    # Plotly Frame must be list of dictionary 
layout = go.Layout(                                
    title = 'Historical SATCAT Payload entries',
    xaxis = dict(
        title = 'Year of Launch'
    ),
    yaxis = dict(
        title = 'Number of Payloads',
    ),
)
fig = go.Figure(data=data, layout=layout)
plotly.offline.iplot(fig)

Note : click on legend to select traces to be displayed

Visualisation : Primary purpose of this grpah is to visualise the timestamped database entries. It is used to validate the input data agains the one shown on the source website https://www.celestrak.com/satcat/boxscore.asp (nothing that for this notebook we have filtered the data entries on payloads.

Import From UCS Database file and Merge using NORAD Key

In [11]:
# Read UCS database in from Excel file
filename0 = 'data/UCS_Satellite_Database_9-1-2017.xlsx'
df_UCS = pd.read_excel('data/UCS_Satellite_Database_9-1-2017.xlsx', sheet_name='Sheet1')

# Process dataframe
df_UCS = df_UCS.iloc[:,1:26]       # select subset of columns to keep
cols = [9,10,11,12,13,16,17,21]
df_UCS.drop(df_UCS.columns[cols],axis=1,inplace=True)   # drop columns

# rename columns
df_UCS = df_UCS.rename(columns={'Country/Org of UN Registry':'UN_Registry',
                                'Country of Operator/Owner':'Country_Owner',
                                'Detailed Purpose':'Detailed_Purpose','Class of Orbit':'OrbitClass',
                                'Type of Orbit':'OrbitType', 'Longitude of GEO (degrees)':'Longitude',
                                'Operator/Owner':'Operator',
                                'Launch Mass (kg.)':'LaunchMass','Dry Mass (kg.)':'DryMass','Expected Lifetime':'Lifetime',
                                'Country of Contractor':'Contractor_Country', 'Launch Vehicle':'Launch_Vehicle',
                                'NORAD Number':'NORAD', 'COSPAR Number':'COSPAR'})

# define Datatypes
df_UCS['Longitude'] = df_UCS.Longitude.astype('float')
df_UCS.LaunchMass.fillna(0, inplace=True)
df_UCS['LaunchMass'] = df_UCS.LaunchMass.astype('float64')
df_UCS.DryMass.fillna(0, inplace=True)
df_UCS['DryMass'] = df_UCS.DryMass.astype('float64')

df_UCS['Country_Owner'] = df_UCS.Country_Owner.astype('category')
df_UCS['Operator'] = df_UCS.Operator.astype('category')
df_UCS['Users'] = df_UCS.Users.astype('category')
df_UCS['Detailed_Purpose'] = df_UCS.Detailed_Purpose.astype('category')
df_UCS['OrbitClass'] = df_UCS.OrbitClass.astype('category')
df_UCS['OrbitType'] = df_UCS.OrbitType.astype('category')
df_UCS['Contractor'] = df_UCS.Contractor.astype('category')
df_UCS['Contractor_Country'] = df_UCS.Contractor_Country.astype('category')
df_UCS['Launch_Vehicle'] = df_UCS.Launch_Vehicle.astype('category')

df2 = pd.merge(left=df, right=df_UCS, how='left', left_on='NORAD', right_on='NORAD', left_index=True)

df2.shape
Out[11]:
(7914, 35)

Payloads Launched by Owership

This plot illustrates the number of payloads launched by US and CIS (Russia) compared to the rest of the owners combined.

Date range can be adjusted using the slider to select smaller range. Dropdown selects group or click on the legend to select/deselected groups.

In [12]:
mytab = pd.crosstab(index=df['LaunchYear'],columns=df['Ownership'],margins=True)  # count payloads by year & owner

mytab2 = mytab[mytab.columns.difference(['US', 'CIS','All'])].drop(['All'],axis=0)  # remove US & CIS to calc reminder

# create traces and plot
trace1 = go.Scatter(
    x = mytab.index,
    y = mytab.All,
    name = 'All',
    marker = dict(color='grey', opacity=1.0, size=4),
    mode = 'lines+markers',
)
trace2 = go.Scatter(
    x = mytab.index,
    y = mytab.US,
    name = 'US',
    marker = dict(color='blue', opacity=1.0, size=4),
    mode = 'lines+markers',
)
trace3 = go.Scatter(
    x = mytab.index,
    y = mytab.CIS,
    name = 'CIS',
    marker = dict(color='red', opacity=1.0, size=4),
    mode = 'lines+markers',
)
trace4 = go.Scatter(
    x = mytab2.index,
    y = mytab2.sum(axis=1),
    name = 'Rest',
    marker = dict(color='green', opacity=1.0, size=4),
    mode = 'lines+markers',
)

data = [trace1, trace2, trace3, trace4]

updatemenus = list([
    dict(type = 'buttons',
         active = -1,
         buttons=list([   
            dict(label = 'All',
                 method = 'update',
                 args = [{'visible': [True, True, True, True]},
                         {'title': 'All',
                          'annotations': []}]),

            dict(label = 'Total',
                 method = 'update',
                 args = [{'visible': [True, False, False, False]},
                         {'title': 'All',
                          'annotations': []}]),
            dict(label = 'US',
                 method = 'update',
                 args = [{'visible': [False, True, False, False]},
                         {'title': 'US',
                          'annotations': []}]),
            dict(label = 'CIS',
                 method = 'update',
                 args = [{'visible': [True, False, True, False]},
                         {'title': 'CIS',
                          'annotations': []}]),
            dict(label = 'Rest',
                 method = 'update',
                 args = [{'visible': [False, False, False, True]},
                         {'title': 'Rest',
                          'annotations': []}])
        ]),
    )
])

layout = go.Layout(
    updatemenus = updatemenus,        # add dropdown menus
    dragmode = 'pan', 
    title = 'Number of Payloads Launched per Country',
    xaxis = dict(
        title = 'Year of Launch',
        rangeslider = dict(),         # xaxis range slider
        type='date'                   # xaxis range slider
    ),
    yaxis = dict(
        autorange =  False,
        range = [0,400],
    ),
)
fig = go.Figure(data=data, layout=layout)
plotly.offline.iplot(fig)

Note : Use the slider to to adjust the time axis.

Visualisation GEO Orbital Arc

This visualisation displays the longitude position of GEO satellites around the orbital arc. Satellite latitude for GEO satellites is 0 degrees, but in this visualisation satellites located within the same 1deg slot are represented as stack formation. Hence the latitude figure is set artifically for the purpose of the plot and can be ignored.

Each marker is color coded according to the User category taken from the UCS database and show in the index. Individual categories can be selected/deselected by clicking on the legend.

Map projection can be switched using the dropdown. And for orthonographic mode the map can be rotated 360degrees.

Hover text is provides each satellite name, longitude position and User category.

In [18]:
# filter only on GEO satellites
df_geo = df2[df2.OrbitClass == 'GEO']
df_geo.insert(0, 'stack',0)                                # create column to include stack 
df_geo.sort_values(by='Longitude',inplace=True)            # set index as Longitude and sort (needed for stack)
df_geo = df_geo.round({'Longitude': 0})                    # round longitude positions from float to integer
df_geo = df_geo.reset_index(drop=True)

# this loop iterates through the index and increments stack value where duplicate longitudes are found
s = 3.5    # s is height above the equator on the scattergeo plot set by visual inspection
for row in range(1,len(df_geo.index)):
    if df_geo.Longitude[row] == df_geo.Longitude[row-1]:
        df_geo.loc[row,'stack'] = df_geo.loc[row,'stack'] + s
        s = s+3.5
    else:
        s = 1

# create a trace list by looping through and User categories, marker colors are set automatically
trace1 = []                              
for user, X in df_geo.groupby('Users'): 
    trace = dict( 
                type = 'scattergeo',
                lon = X['Longitude'],
                lat = X['stack'],
                name = user,
                text = X['SatName'],
                hoverlabel = dict(
                    namelength = -1
                ),
        )
    trace1.append(trace)
    
# Initialize data object 
data = go.Data(trace1)            # this must be list of dictionary 

# create dropdown buttons to switch Map projection
updatemenus=list([
    dict(   
        active=1,
        buttons=list([   
            dict(
              args = ['geo.projection.type','Mercator'],
              label = 'Mercator', 
              method = 'relayout'
            ), 
            dict(
              args = ['geo.projection.type','orthographic'],
              label = 'orthographic', 
              method = 'relayout'
            ), 
       ]),
        xanchor = 'right',
        yanchor = 'top' 
    ),
])             
layout = go.Layout(
    title="Geostationary Satellites by Orbital Position",
    showlegend=True,
    geo=dict(
        projection=dict(type='orthographic'),     # 'Mercator'
        showland=True,
        landcolor='rgb(217, 217, 217)',
        showcoastlines=False,
        showframe=False,
        ),
    updatemenus=updatemenus, 
)

fig = go.Figure(data=data, layout=layout)
plotly.offline.iplot(fig,validate=False)
/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/ipykernel_launcher.py:4: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

Observation : This graph provides a clear visualisation including both detailed position and User information. We can observe the spread of operational satellites around the arc, zoom in and inspect individual name detail that could be used to seek more information. There are many positions where satellites are stacked in the same arc, two examples can be seen at 19 degrees East are 5 (green) ASTRA 1KR/1M/1F/1G/1L and again 7 more ASTRA (green) stacked at 28.2degrees East, these are both well known fleets used for TV broadcast by BSKYB over the UK. North America have a large cluster of Commercial payloads.

There is a large stack at 0deg longitude, on closer inspection these are plotted as the Longitude positions in the database are NULL values.

Visualisation : Scattergeo has been used as the map makes the longitude values meaningful on the world map. Initial visualisations using Polar Chart lacked the context the map provides. Stacking scatter points to visualise payloads makes easily recognisable chart, although only the first row of blue data points uses the correct latitude, this is why no vertical axis labels are used.

Future Improvement : Add dropdown that changes the color coding according to different categories, such as Purpose, Manufacturer etc. If hyerlink could be inserted this could link to the satellites wiki.

The problem is that the go.Data() only takes a single collection, 1 list type as argument, and in this case we have already used the list to iterate over the unique category names (using df.groupby('Users')) to avoid have to manually define each trace for each category. If we append another list to the first containing a second iterative trace (using df.groupby('Users')), its not permitted as single trace containing a dictionary type is expected. I'm sure there must be a method to handle this but i'm yet to find it.

It would also be nice if the orthographic rotation could have a fixed axis.

Visualisation Operational Payloads Utilisation by Orbit type

Interactive Bar charts summarising the Number of operational payloads by Users, Purpose and Launch Site taken from the UCS database.

Drop down menu switches between charts and individual orbits can be selected/deselected by clicking on the legend.

In [19]:
# frequency count for each case
tab1 = pd.crosstab(df2.OrbitClass,df2.Purpose, margins=False)
tab2 = pd.crosstab(df2.OrbitClass,df2.Users, margins=False)
tab3 = pd.crosstab(df2.OrbitClass,df2.LaunchSite, margins=False)

# slice each orbit from crosstab
geo1 = tab1.loc['GEO',:]
meo1 = tab1.loc['MEO',:]
leo1 = tab1.loc['LEO',:]
ell1 = tab1.loc['Elliptical',:]

geo2 = tab2.loc['GEO',:]
meo2 = tab2.loc['MEO',:]
leo2 = tab2.loc['LEO',:]
ell2 = tab2.loc['Elliptical',:]

geo3 = tab3.loc['GEO',:]
meo3 = tab3.loc['MEO',:]
leo3 = tab3.loc['LEO',:]
ell3 = tab3.loc['Elliptical',:]

# filter smallest numbers to reduce number of categories plotted
geo1 = geo1[geo1 > 5].sort_values(ascending=False)
meo1 = meo1[meo1 > 5].sort_values(ascending=False)
leo1 = leo1[leo1 > 5].sort_values(ascending=False)
ell1 = ell1[ell1 > 5].sort_values(ascending=False)

geo2 = geo2[geo2 > 5].sort_values(ascending=False)
meo2 = meo2[meo2 > 5].sort_values(ascending=False)
leo2 = leo2[leo2 > 5].sort_values(ascending=False)
ell2 = ell2[ell2 > 5].sort_values(ascending=False)

geo3 = geo3[geo3 > 5].sort_values(ascending=False)
meo3 = meo3[meo3 > 5].sort_values(ascending=False)
leo3 = leo3[leo3 > 5].sort_values(ascending=False)
ell3 = ell3[ell3 > 5].sort_values(ascending=False)

# All traces are grouped into a single collection
trace1  = go.Bar(x = leo1.index, y = leo1, name = 'LEO',        visible = True)
trace2  = go.Bar(x = geo1.index, y = geo1, name = 'GEO',        visible = True) 
trace3  = go.Bar(x = meo1.index, y = meo1, name = 'MEO',        visible = True) 
trace4  = go.Bar(x = ell1.index, y = ell1, name = 'Elliptical', visible = True) 

trace5  = go.Bar(x = leo2.index, y = leo2, name = 'LEO',        visible = False)
trace6  = go.Bar(x = geo2.index, y = geo2, name = 'GEO',        visible = False)  
trace7  = go.Bar(x = meo2.index, y = meo2, name = 'MEO',        visible = False)
trace8  = go.Bar(x = ell2.index, y = ell2, name = 'Elliptical', visible = False) 

trace9  = go.Bar(x = leo3.index, y = leo3, name = 'LEO',        visible = False)
trace10 = go.Bar(x = geo3.index, y = geo3, name = 'GEO',        visible = False)  
trace11 = go.Bar(x = meo3.index, y = meo3, name = 'MEO',        visible = False)  
trace12 = go.Bar(x = ell3.index, y = ell3, name = 'Elliptical', visible = False)

data = go.Data([trace1,trace2,trace3,trace4,trace5,trace6,trace7,trace8,trace9,trace10,trace11,trace12])
updatemenus = list([
    dict(active=0,
         buttons=list([     # each trace visibility setting is set using the dropdown menu
            dict(
              args = [{'visible':[True, True, True, True, False, False, False, False, False, False, False, False]},
                      {'title':'Purpose'}],
              label = 'Purpose', 
              method = 'update',   # (restyle,relayout,update,animate
            ),
            dict(
              args = [{'visible':[False, False, False, False, True, True, True, True, False, False, False, False]},
                      {'title':'Users'}],
              label = 'Users', 
              method = 'update',   # (restyle,relayout,update,animate
            ),
            dict(
              args = [{'visible':[False, False, False, False, False, False, False, False, True, True, True, True]},
                      {'title':'Launch Site for Orbit Payloads (only)'}],
              label = 'Launch Site', 
              method = 'update',   # (restyle,relayout,update,animate
            ),
      ]), 
      xanchor = 'right',
      yanchor = 'top',
    )
])
layout = go.Layout(
    barmode = 'stack',
  #  title='Launch Site Frequency',
    yaxis=dict(
        title='Number of Payloads',
    ),
    updatemenus=updatemenus, 
)
fig = go.Figure(data=data, layout=layout)
plotly.offline.iplot(fig,validate=False)

Observation : Purpose chart, there are more LEO payloads than any others, and they are used for a range of appplications from earth obsersation, communications to space science and technology demonstration. MEO payloads are Navigation (such as GPS, GALILEO and GLONASS). GEO are used mainly used for Telecommunications. Ellipical orbit is a special orbit used for polar weather and high latitude coverage.

Users chart, we observe a 50/50 split for LEO/GEO payloads for commercial Users.

Launch Site Chart, there are some interesting insights. LEO payloads are launched from a wider range of sites than others, typically they are small so can be launched from smaller facilities. GEO launches are spread over only 4 sites likely due to the larger size and facilities need to be bigger. Europes spaceport in French Guiana is the only site to launch more GEOs than other types, this is due to the Ariane 5 dual launcher.

Visualisation : Interactive bar charts provide a good amount of insights while featuring a dropdown allows different data to be visualised on the same chart.

Further Improvements : additional categories can be added to the dropdown

Visualisation Launch Sites Historical Analysis

Interactive Pie chart displaying the proportion of total recorded launches at each site.

Dropdown switches view between top 5 to improve clarity on the chart.

Moving the cursor over the chart displays the Full site information merged from the satcat annex.

Note this chart is different from the bar chart above which uses only Operational payloads, whereas this chart use all payloads launched.

In [20]:
# Group Launchsites and count occurances
df_sites = df.groupby('LaunchSite').size().sort_values(ascending = False).to_frame()
df_sites.columns = ['count']

# Merge Full Launchsite names from SATCAT Annex
df_sites = pd.merge(df_sites, df_Launchsites, right_index=True, left_index=True)
 
#Select only top 5
df_sites_top5 = df_sites.head(5) 

# Add up the remaining sites and add to top5 as 'Others' to keep total percentage correct
df_sites_rest = df_sites.index.difference(df_sites_top5.index)
df_sites_rest = df_sites.loc[df_sites.index.difference(df_sites_top5.index),:]
data = [[df_sites_rest['count'].sum(), 'Others']]
others = pd.DataFrame(data, columns=['count','SiteName'], index=['Others'])
df_sites_top6 = df_sites_top5.append(others)

trace1 = go.Pie(labels = df_sites_top6.index, 
                values = df_sites_top6['count'], 
                textinfo = 'label+value+percent',
                hovertext = df_sites_top6['SiteName'],
                insidetextfont=dict(color='white'),
                visible = True,
)
trace2 = go.Pie(labels = df_sites_rest.index, 
                values = df_sites_rest['count'], 
                textinfo = 'label+value+percent',
                hovertext = df_sites_rest['SiteName'],
                insidetextfont=dict(color='white'),
                visible = False,
)

data = go.Data([trace1, trace2])
updatemenus = list([
    dict(active=0,
         buttons=list([
            dict(
              args = [{'visible':[True, False]},
                      {'title':'Top 5 Launch Sites'}],
              label = 'Top 5', 
              method = 'update',   # (restyle,relayout,update,animate
            ),
            dict(
              args = [{'visible':[False, True]},
                      {'title':'Others'}],
              label = 'Others', 
              method = 'update',   # (restyle,relayout,update,animate
            ),
      ]), 
      xanchor = 'right',
      yanchor = 'top',
    )
])
layout = go.Layout(
    title='Launch Site Frequency',
    showlegend=False,
    updatemenus=updatemenus, 
)
fig = go.Figure(data=data, layout=layout)
plotly.offline.iplot(fig,validate=False)

Observation : Plesetek (Russia) and Baiknour (Kazakstan) each have 25% of the total payload launch count while another 27% is taken by the two US sites. European Spaceport Korou has only 6% of the total count and the remainder is made up of other sites.

It is useful to consider this chart in relation to the previous bar chart of launches for different orbit types, as while Korou (FRGUI) has launched only 6% of the total it has launched the highest number of GEOs.

Animation Historical Payload Launches

Animation of all recorded payload lauunches since 1957 to date, plotted according to Launch Data and Apogee altitude with color coding representing Launch Site.

Animation can be executed using the Play/Pause buttons or slider can be set manually. Individal traces can be selected/deselecting using the slider.

Hovering over the marker provides satellite name, launch sate and altitude.

The template for this plot was taken from the Plotly website https://plot.ly/python/animations/

In [21]:
years = df['LaunchYear'].unique().astype(str).tolist() 
sites = df['LaunchSite'].unique().astype(str).tolist()    # AP rewrite
owners = df['Ownership'].unique().astype(str).tolist()

# make figure  - For the slider to appear, we need to adda sliders dictionary to layout. The sliders dictionary is set in the following way:
figure = {
    'data': [],
    'layout': {},
    'frames': []
}

# fill in most of layout
figure['layout']['title'] = 'Historical Payload Launches per Country'
figure['layout']['xaxis'] = {'nticks':60,'anchor':'y','autorange':False, 'range':[pd.datetime(1957,1,1),pd.datetime(2017,1,1)],'title': 'Year of Launch'} 
figure['layout']['yaxis'] = {'anchor':'y','autorange':False, 'range':[0,50000],'title': 'Altitude (km)'}
figure['layout']['hovermode'] = 'closest'
figure['layout']['sliders'] = {
    'args': [
        'transition', {
            'duration': 20,
            'easing': 'elastic'
        }
    ],
    'initialValue': '1957',
    'plotlycommand': 'animate',
    'values': years,
    'visible': True
}
figure = {
    'data': [],
    'layout': {},
    'frames': []
}

# fill in most of layout
figure['layout']['title'] = 'Historical Payload Launches per Country'
figure['layout']['xaxis'] = {'nticks':60,'anchor':'y','autorange':False, 'range':[pd.datetime(1957,1,1),pd.datetime(2017,1,1)],'title': 'Year of Launch'} 
figure['layout']['yaxis'] = {'anchor':'y','autorange':False, 'range':[0,50000],'title': 'Altitude (km)'}
figure['layout']['hovermode'] = 'closest'
figure['layout']['sliders'] = {
    'args': [
        'transition', {
            'duration': 20,
            'easing': 'elastic'
        }
    ],
    'initialValue': '1957',
    'plotlycommand': 'animate',
    'values': years,
    'visible': True
}

# Add Slider
figure['layout']['updatemenus'] = [
    {
        'buttons': [
            {
                'args': [None, {'frame': {'duration': 100, 'redraw': False},
                         'fromcurrent': True, 'transition': {'duration': 100}}], #, 'easing': 'quadratic-in-out'
                'label': 'Play',
                'method': 'animate'
            },
            {
                'args': [[None], {'frame': {'duration': 0, 'redraw': False}, 'mode': 'immediate',
                'transition': {'duration': 0}}],
                'label': 'Pause',
                'method': 'animate'
            }
        ],
        'direction': 'left',
        'pad': {'r': 10, 't': 20},
        'showactive': False,
        'type': 'buttons',
        'x': 0.1,
        'xanchor': 'right',  # xanchor is similar, only with left and right as possible values
        'y': 0,
        'yanchor': 'top'    # yanchor determines whether the slider is on the top or bottom of the chart page
    }
]

sliders_dict = {
    'active': 0,
    'yanchor': 'top',
    'xanchor': 'left',
    'currentvalue': {           # currentvalue sets the display of the current value that the slider is hovering on. 
        'font': {'size': 20},   # It contains args such as prefix, which sets the text that appears before the value.
        'prefix': 'Year:',
        'visible': True,
        'xanchor': 'right'
    },
    'transition': {'duration': 20},#, 'easing': 'cubic-in-out'},
    'pad': {'b': 10, 't': 20},
    'len': 0.9,
    'x': 0.1,
    'y': 0,
    'steps': []  # steps is a list of dictionaries each of which corresponds to a frame in the figure. 
                 # They should be ordered in the sequence in which the frames occur in the animation.
}
sliders_dict = {
    'active': 0,
    'yanchor': 'top',
    'xanchor': 'left',
    'currentvalue': {
        'font': {'size': 20},
        'prefix': 'Year:',
        'visible': True,
        'xanchor': 'right'
    },
    'transition': {'duration': 20},#, 'easing': 'cubic-in-out'},
    'pad': {'b': 10, 't': 20},
    'len': 0.9,
    'x': 0.1,
    'y': 0,
    'steps': []
}

# make data
year = 1957
#for owner in owners:
for site in sites:
    df_by_year = df[df['LaunchYear'] <= year]
    df_by_year_and_owner = df_by_year[df_by_year['LaunchSite'] == site]

    data_dict = {
        'x': df_by_year_and_owner['LaunchDate'],
        'y': df_by_year_and_owner['Apogee'],
        'mode': 'markers',
        'text': df_by_year_and_owner['SatName'],
        'marker': {
        #    'sizemode': 'area',
        #    'sizeref': 200000,
            'size': 10, # list(dataset_by_year_and_owner['Inclin'])
        },
        'name': site,
        'visible' : True
    }
    figure['data'].append(data_dict)

# make frames
for year in years:
    frame = {'data': [], 'name': str(year)}
#    for owner in owners:
    for site in sites:
        df_by_year = df[df['LaunchYear'] <= int(year)]
        df_by_year_and_owner = df_by_year[df_by_year['LaunchSite'] == site]

        data_dict = {
            'x': df_by_year_and_owner['LaunchDate'],
            'y': df_by_year_and_owner['Apogee'],
            'mode': 'markers',
            'text': df_by_year_and_owner['SatName'],
            'marker': {
            #    'sizemode': 'area',
            #    'sizeref': 200000,
                'size': 10,   #list(dataset_by_year_and_owner['Inclin'])
            },
            'name': site,
            'visible' : True
        }
        frame['data'].append(data_dict)

    figure['frames'].append(frame)
    
    slider_step = {'args': [
        [year],
        {'frame': {'duration': 20, 'redraw': True},
         'mode': 'immediate',
         'transition': {'duration': 20}}
         ],
         'label': year,
         'method': 'animate'}
    sliders_dict['steps'].append(slider_step)

# Plot the figure
figure['layout']['sliders'] = [sliders_dict]
plotly.offline.iplot(figure,validate=False)

Observations : Animation visualises the distribution of payloads over orbital altitude and time. LEO/MEO/GEO belts are clearly visible. Middle MEO cluster shows the introduction of GPS (NAVSTAR)and GLONASS satellites used for Navigation.

Further work : It would be nice to add the decay date so as the animation progresses satellite decay aswell as launch. It was also intended to size the marker bubbles according to payload dry mass taken from the UCS database, however soon become too cluttered.

Discussion

Datasets satcat provides a reliable up to date source with consistent fixed format. The majority of the task involved he collection from source as CSV, parsing and storing in Mysql. While this took some coding it only had to be done once. It was only towards the end the LaunchDate order was not 100% ordered which I would have noted earlier by plotting the first chart. Also I should have used the LaunchDate as datetime index from the start, and done a sort_index for good measure.

UCS is less consistent and reliable. I suspect it has been created manually and could do with additional cleaning, for example some numeric columns have occassional string characters (such as + or (EOL) which could be removed with re if these columns were to be used. Columns names are in long strings with whitespace and escape characters and had to be rewritten. There are a few too many categories that could be reduced.

MySQL table creation and insertion was at first timeconsuming due to the constraints of the INSERT string, but once automated this not a problem. I wanted to use the SQL query more extensively but having reduced the row count to <8k on import to notebook it was just as easy for Pandas to do the hard work. I got caught out with INSERT delimiter issue which I only noticed later "Europe's Korou Spacecentre" just did not appear until it was caught and fixed. Accurate setting of table data types was useful.

SqlAlchamy useful for db queries but pandas has single line wrapper. Didnt need for INSERT as still uses SQL format in any case.

Plotly is a great visualisation library particularly the offline interactive features. However the User Manual is undetailed, Attribute reference table is incomplete and animation is not included. There are user uploaded examples but many dont execute and are badly written without loop for trace creation. Single collection limitation was a constraint.

Further Work

In part 1 information spacecraft details were webscraped and stored in the Mysql database. The intention was to search for key satellite name to search and extract fractual information. There was not time to complete as part of this exercise.

Automate periodic check of datasources at the start of each notebook run to ensure its up to date.

Calculate total Launch weight to see which launch sites lifted the most by mass.

Create calender table in SQL to allow the Launch Date-Decay Date period to be used to count number of payloads in orbit on any calender date.

References

https://www.celestrak.com/satcat/search.asp https://www.ucsusa.org/nuclear-weapons/space-weapons/satellite-database https://plot.ly/python/